Speech recognition using voice-characteristic-dependent acoustic models

Authors

Hiroyuki Suzuki

Heiga Zen

Yoshihiko Nankaku

Chiyomi Miyajima

Keiichi Tokuda

Tadashi Kitamura

Abstract

This paper proposes a speech recognition technique based on acoustic models that account for variations in voice characteristics. Context-dependent acoustic models, typically triphone HMMs, are often used in continuous speech recognition systems. This work hypothesizes that the speaker voice characteristics humans can perceive by listening are also factors in acoustic variation that should be considered when constructing acoustic models, and it also applies a tree-based clustering technique to speaker voice characteristics to construct voice-characteristic-dependent acoustic models. In speech recognition using triphone models, the neighboring phonetic context is given in advance by linguistic-phonetic knowledge; in contrast, in recognition using voice-characteristic-dependent acoustic models, the voice characteristics of the input speech are unknown. This paper proposes a method for recognizing speech even when the voice characteristics of the input speech are unknown. The results of a gender-dependent speech recognition experiment show that the proposed method achieves higher recognition performance than conventional methods.
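The abstract does not say how the unknown voice characteristics are handled at recognition time. As a rough illustration of the problem only, and not the method proposed in the paper, one generic approach is to score the input against each voice-characteristic-dependent model and keep the best-scoring one. The sketch below assumes toy single-Gaussian models with diagonal covariance; the class labels, the `MODELS` table, and both function names are hypothetical.

```python
import math


def gaussian_log_likelihood(frames, mean, var):
    """Sum of diagonal-Gaussian log densities over all feature frames."""
    total = 0.0
    for frame in frames:
        for x, m, v in zip(frame, mean, var):
            total += -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
    return total


def pick_voice_class(frames, models):
    """Return (label, score) of the model with the highest log-likelihood."""
    scores = {
        label: gaussian_log_likelihood(frames, m["mean"], m["var"])
        for label, m in models.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


# Hypothetical two-class model set with toy parameters (not from the paper).
MODELS = {
    "class_a": {"mean": [0.0, 0.0], "var": [1.0, 1.0]},
    "class_b": {"mean": [3.0, 3.0], "var": [1.0, 1.0]},
}

# Two toy feature frames that lie near the "class_b" mean.
frames = [[2.8, 3.1], [3.2, 2.9]]
label, score = pick_voice_class(frames, MODELS)
print(label)
```

In a real system each class would be a full set of HMMs and the score would come from decoding, but the selection step has the same shape: the voice-characteristic label is treated as a hidden variable and chosen by likelihood.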

Similar resources

Continuous Speech Recognition Based on General Factor Dependent Acoustic Models

This paper describes continuous speech recognition that incorporates additional complementary information, e.g., voice characteristics, speaking styles, linguistic information, and noise environment, into HMM-based acoustic modeling. In speech recognition systems, context-dependent HMMs, i.e., triphones, and tree-based context clustering have commonly been used. Several attempts to utilize not on...

Allophone-based acoustic modeling for Persian phoneme recognition

Phoneme recognition is one of the fundamental phases of automatic speech recognition. Coarticulation, which refers to the integration of sounds, is one of the important obstacles in phoneme recognition. In other words, each phone is influenced and changed by the characteristics of its neighboring phones, and coarticulation is responsible for most of these changes. The idea of modeling the effects o...

Persian Phone Recognition Using Acoustic Landmarks and Neural Network-based variability compensation methods

Speech recognition is a subfield of artificial intelligence that develops technologies to convert speech utterances into transcriptions. So far, various methods such as hidden Markov models and artificial neural networks have been used to develop speech recognition systems. In most of these systems, the speech signal frames are processed uniformly, while the information is not evenly distributed ...

Towards age-independent acoustic modeling

In automatic speech recognition applications, due to significant differences in voice characteristics, adults and children are usually treated as two population groups, for which different acoustic models are trained. In this paper, age-independent acoustic modeling is investigated in the context of large vocabulary speech recognition. Exploiting a small amount (9 hours) of children’s speech an...

Voice-based Age and Gender Recognition using Training Generative Sparse Model

Gender recognition and age detection are important problems in telephone speech processing to investigate the identity of an individual using voice characteristics. In this paper, a new gender and age recognition system is introduced based on generative incoherent models learned using sparse non-negative matrix factorization and an atom correction post-processing method. Similar to genera...

Publication date: 2003
